The PAGE (Page Analysis and Ground-truth Elements) Format Framework

Authors

  • S. Pletschacher
  • A. Antonacopoulos
Abstract

There is a plethora of established and proposed document representation formats, but none can adequately support the individual stages within an entire sequence of document image analysis methods (from document image enhancement to layout analysis to OCR) and their evaluation. This paper describes PAGE, a new XML-based page image representation framework that records information on image characteristics (image borders, geometric distortions and corresponding corrections, binarisation, etc.) in addition to layout structure and page content. The suitability of the framework for the evaluation of entire workflows as well as individual stages has been extensively validated by using it in high-profile applications such as public contemporary and historical ground-truthed datasets and the ICDAR Page Segmentation competition series.
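
As a rough illustration of what an XML-based page representation of this kind might look like, the sketch below writes a page description with one text region and reads it back. The element and attribute names (`PageDescription`, `Page`, `TextRegion`, `Coords`, `Text`), the file name and the coordinates are all hypothetical and chosen for readability; they are not taken from the PAGE schema itself, which should be consulted for the real structure.

```python
import xml.etree.ElementTree as ET


def build_page(image_filename, width, height, regions):
    """Build an illustrative page description (hypothetical element names)."""
    root = ET.Element("PageDescription")
    page = ET.SubElement(
        root,
        "Page",
        imageFilename=image_filename,
        imageWidth=str(width),
        imageHeight=str(height),
    )
    for region in regions:
        node = ET.SubElement(page, "TextRegion", id=region["id"])
        # Store the region outline as a space-separated list of "x,y" pairs.
        points = " ".join(f"{x},{y}" for x, y in region["polygon"])
        ET.SubElement(node, "Coords", points=points)
        ET.SubElement(node, "Text").text = region["text"]
    return root


def region_bounding_boxes(root):
    """Return {region id: (min_x, min_y, max_x, max_y)} for every text region."""
    boxes = {}
    for node in root.iter("TextRegion"):
        raw = node.find("Coords").get("points").split()
        pts = [tuple(int(v) for v in pair.split(",")) for pair in raw]
        xs = [x for x, _ in pts]
        ys = [y for _, y in pts]
        boxes[node.get("id")] = (min(xs), min(ys), max(xs), max(ys))
    return boxes


# Hypothetical sample data: one rectangular region on an A4 scan at 300 dpi.
regions = [
    {
        "id": "r1",
        "polygon": [(10, 20), (200, 20), (200, 80), (10, 80)],
        "text": "Sample heading",
    }
]
root = build_page("page_0001.tif", 2480, 3508, regions)

# Round-trip through a string to mimic saving and reloading a ground-truth file.
xml_text = ET.tostring(root, encoding="unicode")
reparsed = ET.fromstring(xml_text)
print(region_bounding_boxes(reparsed))
```

Keeping region outlines as polygons rather than only bounding boxes lets the same file serve both layout-analysis evaluation (exact region shapes) and simpler consumers that only need boxes, which can be derived as shown.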


Similar resources

Multi Attribute Analysis of a Novel Compact UWB Antenna with Via-fed Elements for Dual Band Notch Function (RESEARCH NOTE)

A compact microstrip-fed antenna with dual notched bands is proposed. First, a simple basic configuration is presented for Ultra Wide Band (UWB) applications and then the dual band notched structure is extended from the UWB one. The basic structure of the UWB antenna consists of a simple square radiating patch and a ground plane with a wide square slot on back of the substrate. A semi-circle sh...


Participatory Classification in a System for Assessing Multimodal Transportation Patterns

There has been an increasing trend of performing inference on data collected by smartphones to provide context-aware location-based services. When this inference is performed using supervised analysis, these services need ground truth if high accuracies are desired. While accuracy is less of a concern for services targeted at individuals, it is important when individual data is aggregated for s...


A Framework for Constructing Benchmark Databases and Protocols for Retinopathy in Medical Image Analysis

We address performance evaluation practices for developing medical image analysis methods, and contribute to the practice of establishing and sharing databases of medical images with verified ground truth and solid evaluation protocols. This helps to develop better algorithms, to perform fair method comparisons, including the state-of-the-art methods, and consequently, supports technology transfer...


Random Table and Its Ground Truth Automatic Generation: A Tool for Table Understanding Research

We developed a software tool to assist table understanding research. It can analyze any given table ground truth and generate documents that include similar table elements while having more variety in both table and non-table parts. Based on our novel content-matching ground-truthing idea, the table ground truth data for the generated table elements become available with little manual work. The v...


Neural Network Boundary Detection for 3D Vessel Segmentation

Conventionally, hand-crafted features are used to train machine learning algorithms, however choosing useful features is not a trivial task as they are very much data-dependent. Given raw image intensities as inputs, supervised neural networks (NNs) essentially learn useful features by adjusting the weights of its nodes using the back-propagation algorithm. In this paper we investigate the perf...




Publication date: 2010